🎮 Reinforcement Learning - laurynas · Scour

Control Reinforcement Learning: Token-Level Mechanistic Analysis via Learned SAE Feature Steering

arxiv.org·14h

⚙Context engineering

Playing 20 Question Game with Policy-Based Reinforcement Learning

arxiv.org·2d

Show HN: Fighting the War Against Expensive Reinforcement Learning

cadenza-landing-qtu7gbjwb-akshparekh123-3457s-projects.vercel.app·12h·

Discuss: Hacker News

⚙Context engineering

Recursive self-improvement from AI models

marginalrevolution.com·2d·

Discuss: Hacker News

Your AI Strategy Has a Human-Shaped Hole

superiortech.io·5h·

Discuss: Hacker News

Why AI Breaks Down Without Real-Time Data in Defense Operations

singlestore.com·4h

ashworks1706/rlhf-from-scratch: A theoretical and practical deep dive into Reinforcement Learning with Human Feedback and its applications in Large Language Models from scratch.

github.com·2d·

Discuss: Hacker News

⚙Context engineering

Show HN: A minimal online decision maker

decisionmaker.online·1d·

Discuss: Hacker News

👆human-computer interaction

Part 2 - AI Chat Evaluation of the Formal Language in He Xin's PEPC System

news.ycombinator.com·1d·

Discuss: Hacker News

🤝Multi-Agent Systems

Digitizing the "Shokunin": How we encoded a Master's hammer strike into AI

yusukekaizen.substack.com·12h·

Discuss: Substack

Task-Completion Time Horizons of Frontier AI Models

metr.org·1d·

Discuss: Hacker News

Architectural and Mathematical Foundations of Machine Learning: A Rigorous Synthesis of Theory, Geometry, and Implementation

chizkidd.github.io·1d·

Discuss: Hacker News

Cyber Model Arena

wiz.io·3h·

Discuss: Hacker News

Schedules of Reinforcement in Psychology (Examples)

simplypsychology.org·2d·

Discuss: Hacker News

⚙Context engineering

Robots That Can See Around Corners Using Radio Signals and AI

seas.upenn.edu·1d·

Discuss: Hacker News

3D Tissue Braiding – a new, simpler way to build robotics

allonic.co·2d·

Discuss: Hacker News

⚙Context engineering

Outcome Engineering

o16g.com·1d·

Discuss: Hacker News

⚙Context engineering

Self-Referential Quantum Barriers for AGI Containment

redact-app.com·1d·

Discuss: Hacker News

⚙Context engineering

☞ Maxis Software Toys

arbesman.substack.com·1d·

Discuss: Substack

New ARIA research funding programme: nearly £50M to secure AI agents in the wild

aria.org.uk·2d·

Discuss: Hacker News
